Compiler CH5 Top-Down Parsing

Compiler CH5 Top-Down Parsing

objectves of Top-Down Parsing

an attempt to find a leftmost derivation for an input string

an attempt to construct a parse tree for the input string starting from the root and creating the nodes of the parse tree in preorder

2 forms

  • Recursive-descent parsers
    • contain a set of mutually recursive procedures that cooperate to parse a string.
    • Code for these procedures can be written directly from a suitable grammar.
  • Table-driven LL parsers
    • use a generic LL(k) parsing engine and a parse table that directs the activity of the engine.
    • The entries for the parse table are determined by the particular LL(k) grammar. The notation LL(k) is explained below.

basic method of recursive-descent

using EBNF(擴展巴斯科範式)

  • Consider now the case of an exp⁡ in the grammar for simple arithmetic expressions in BNF:
    • 𝑒𝑥𝑝→exp⁡〖𝑎𝑑𝑑𝑜𝑝 𝑡𝑒𝑟𝑚 | 𝑡𝑒𝑟𝑚〗(可能不斷走exp)
  • The solution is to use the EBNF rule
    • 𝑒𝑥𝑝→𝑡𝑒𝑟𝑚{𝑎𝑑𝑑𝑜𝑝 𝑡𝑒𝑟𝑚}

Approaches of Top-parsing

with backtracking (making repeated scans of the input, a general form of top-down parsing)
 
Methods: To create a procedure for each nonterminal.

  • 為每個終止符創造procedure

problems?

  • left-recursion (can cause a top-down parser to go into an infinite loop)

    • Def. A grammar is said to be left-recursive
      • if it has a nonterminal 𝐴 s.t. there is a derivation 𝐴⟹𝐴𝛿 for some 𝛿.
  • backtracking - undo not only the movement but also the semantics entering in symbol table.

    • 回溯不僅撤銷動作,也撤銷進入符號表的語意

the order the alternatives are tried (For the grammar shown above, try 𝑤=𝑐𝑎𝑏𝑑 where 𝐴→𝑎 is applied first)

  • 我看不懂…

LL(1) predict function

single symbol lookahead

The limitation of LL(1)

  • LL(1) contains exactly those grammars that have disjoint predict sets for productions that share a common left-hand side
  • 會包含公共左側且不相交的預測集

A grammar G is LL(1) if and only if whenever 𝐴→𝛼|𝛽 are two distinct productions of G, the following conditions hold:

  1. For no terminal 𝒂 do both 𝛼 and 𝛽 derive strings beginning with 𝒂.
  2. At most one of 𝛼 and 𝛽 can derive the empty string.
  3. If 𝛽⇒ then 𝛼 does not derive any string beginning with a terminal in FOLLOW(𝐴).
    • Likewise, if 𝛼⇒ then 𝛽 does not derive any string beginning with a terminal in FOLLOW(𝐴)

First set

Ex.

Follow set

Ex.

LL(1) prediction function

A grammar 𝐺 is LL(1) if and only if whenever 𝐴→𝛼|𝛽 are two distinct productions of 𝐺, the following conditions hold:

  1. For no terminal 𝑎 do both 𝛼 and 𝛽 derive strings beginning with 𝑎. First(𝛼)∩First(𝛽)=𝜙
  2. At most one of 𝛼 and 𝛽 can derive the empty string.
  3. If 𝛽⟹, then 𝛼 does not derive any string beginning with a terminal in Follow(𝐴).
    • Likewise, if 𝛼⟹, then 𝛽 does not derive any string beginning with a terminal in Follow(𝐴).
    • First(𝛼)∩Follow(𝐴)=𝜙 (i.e. If First(𝛼) contains 𝜆 then First(𝛽)∩Follow(𝐴)=𝜙 )

end of file token

  • $

LL(1) Parse Table

  • An LL(1) parse table
  • The definition of 𝑇
    • 𝑇[𝐴][𝑡]= if 𝑡∈
    • 𝑇[𝐴][𝑡]=Error, otherwise

LL(1) Parsing
* 𝑆→(𝑆)𝑆

  • 𝑆→𝜆
  • Input String: ()
  • then
    • 𝑆⟹${lm} (𝑆)𝑆⟹{lm} ()𝑆⟹_{lm} ()$

Elimination of left recursion

  • It is possible for a recursive-descent parser to loop forever. A problem arises with “left-recursive” productions like
    • expr → expr + term
  • A left-recursive production can be eliminated by rewriting the offending production. Consider a nonterminal A with two productions
    • A → A𝛼|𝛽
    • For example, A= expr, 𝛼= + term, 𝛽= term

We can convert left recursion to right recursion in the following manner, using a new nonterminal R:

  • A →𝛽R
  • R → 𝛼R|ϵ

Immediate left recursion can be eliminated by the following technique, which works for any number of 𝐴-productions. First, group the productions as

  • 𝐴→

where no 𝛽_𝑖 begins with an 𝐴. Then, replace the 𝐴-productions by

  • 𝐴→
  • 𝐴′→

algorithm: Elimination left recursion
Input: Context-free Grammar G with no cycles or 𝜆-production
Methods:

  1. Arrange the nonterminals in some order
  2. for 𝑖=1 to 𝑛 do
    {
    	for 𝑗=1 to 𝑖−1 do 
    	{
    		replace each production of the form 𝐴_𝑖→𝐴_𝑗 𝛾 by the production 𝐴_𝑖→𝛿_1 𝛾|𝛿_2 𝛾|⋯|𝛿_𝑘 𝛾, where 𝐴_𝑗→𝛿_1 |𝛿_2 |⋯|𝛿_𝑘 are all current 𝐴_𝑗-production;
        } 
    	eliminate the immediate left-recursion among the 𝐴_𝑖-production;
    }

Ex.
e.g.
𝑆 →𝐴𝑎 | 𝑏
𝐴 → 𝐴𝑐 | 𝑆𝑑 | 𝑒

  • Step 1: ==> 𝑆 → 𝐴𝑎 | 𝑏
  • Step 2: ==>  𝐴 → 𝐴𝑐 | 𝐴𝑎𝑑 | 𝑏𝑑 | 𝑒
  • Step 3: ==>  𝐴 → 𝑏𝑑𝐴′ |𝑒𝐴′ 𝐴′ → 𝑐𝐴′ |𝑎𝑑𝐴′ |

Non-backtracking(recursive-descent) parsing

recursive descent : use a collection of mutually recursive routines to perform the syntax analysis.

Predicative Parsing

Features:

  • maintains a stack rather than recursive calls
  • table-driven
    Components:
  1. An input buffer with end marker ($)
  2. A stack with endmarker ($) on the bottom
  3. A parsing table, a two-dimensional array 𝑀[𝐴,𝑎]
    • where ‘𝐴’ is a nonterminal symbol and ‘𝑎’ is the current input symbol (terminal/token).

Parsing Table

  • an ex.

Algorithm

input

  • an input string and a parsing table for grammar G

ouput

  • a leftmost derivation of or an error indication

initially w is in the stack

  • Method

construct a Predicative parsing table

LL(1) grammar

A grammar whose parsing table has no multiply-defined entries is said to be LL(1).

First ‘L’ : scan the input from left to right.
Second ‘L’ : produce a leftmost derivation.
‘1’ : use one input symbol to determine parsing action.

  • No ambiguous or left-recursive grammar can be LL(1).

Compiler CH5 Top-Down Parsing
https://z-hwa.github.io/webHome/[object Object]/Compiler/Compiler-CH5-Top-Down-Parsing/
作者
crown tako
發布於
2024年4月18日
許可協議